ABSTRACT


The National Aeronautics and Space Administration (NASA) and its prime
contractors currently use a software tool called RMAT (the Reliability and
Maintainability Assessment Tool) to forecast Orbital Replacement Unit
(ORU) failure rates and associated maintenance demands for the International
Space Station (ISS). This thesis introduces a new model, CMAM (the
Comparative Maintenance Analysis Model), which was developed to replicate
some of the basic functionality of RMAT in order to provide a comparative look at
RMAT results. The CMAM program, developed in Visual Basic .NET and
dynamically linked to a Microsoft Access database, focuses on a
representative set of critical ORUs (key items that require both internal and
external maintenance, in both pressurized and un-pressurized storage) and
generates failure rate data for each critical ORU. The results of the CMAM
model are then compared with the failure rates generated by the RMAT program
for the same set of critical ORUs. These two independently developed sets of
data are then analyzed against historical failure rates for these ISS parts.

The results of this analysis are used to conduct a sensitivity analysis of
both the CMAM and RMAT programs in order to help identify the primary
factors contributing to the divergence of forecasted failures and
associated maintenance from actual (historical) failure rates.

Recommendations are provided, based upon the results of the
comparison, on the sensitivity of RMAT to changes in certain input
parameters and on the feasibility of implementing CMAM as a
comparative tool for both NASA and Boeing Logistics and Maintenance
(L&M) personnel. CMAM could support RMAT sensitivity analysis as well as
initial operational planning for optimizing ORU stocking levels while awaiting
more comprehensive RMAT results.

EXECUTIVE SUMMARY


The logistics and maintenance of the International Space Station (ISS) is a
one-of-a-kind system with over 5700 orbital replacement units (ORUs)¹ and
spare parts that number into the hundreds of thousands. Parts for the ISS come
from 127 major US vendors and 70 major international vendors. It is the
responsibility of the International Space Station Logistics and Maintenance (L&M)
organization at Johnson Space Center in Houston, Texas, to integrate and test
these spares before either delivery to the ISS or ground spare storage.

The objective of the ISS L&M organization is to define, procure, deliver,
and manage the resources required to support and maintain ISS systems and
support equipment both on-orbit and on the ground during assembly and
assembly-complete operations of the ISS. To meet this objective, NASA
ISS L&M must maintain a comprehensive ORU and spares database with
up-to-date reliability data for use in predicting and evaluating on-orbit and
ground spares requirements.

The primary tool used for this purpose is the Reliability and Maintainability
Assessment Tool (RMAT). RMAT is a simulation tool that generates ORU
failures, quantifies corrective and preventative maintenance requirements, and
quantifies ISS resources needed to restore the ISS to an operational state.
RMAT is the ISS Program/GAO-accepted² tool for conducting maintenance
prediction analysis and trade studies and, when used in concert with an accurate
and updated ORU database and with other tools such as Steady State³
spreadsheets, provides a robust set of forecast data. However, RMAT is a

1 Although the total number of ORUs is estimated at 5700, the ORU database (MADS) lists
1379 unique ORU types available for reliability analysis.

2 The U.S. General Accounting Office (GAO) is the investigative arm of Congress whose
mission is to execute audits, surveys, investigations and evaluations of Government programs to
support oversight and funding decisions.

3 ISS Steady State is defined as operations after Assembly Complete (AC).
complex program written in an obsolete programming language (FORTRAN 77)
that requires a high degree of user familiarity in order to produce meaningful
results and to summarize those results for decision-making purposes. RMAT
requires the preparation of multiple text-based input files, which can be, and
often is, extremely time-consuming. Additionally, while recent analysis of actual
versus predicted internal ORU failures shows a reasonable amount of correlation,
external ORU failures show an increasing divergence between RMAT-forecasted
failures and actual failures.

The ISS Comparative Maintenance Analysis Model (CMAM) was
developed to replicate some of the basic functionality of RMAT in the areas of
corrective and preventative ORU failure rate forecasts and required crew
maintenance time in order to:

- Gain an understanding of the underlying algorithms used by RMAT for
failure rate generation

- Provide a user-friendly Graphical User Interface (GUI) based program
that allows for the generation of a basic set of comparative results against the
more complex and comprehensive RMAT forecasts

- Conduct a sensitivity analysis on both CMAM and RMAT results in order
to identify why forecasted external failure rates have diverged from
actual failure rates while internal failure rate forecasts remain relatively accurate.

The CMAM program developed during this thesis study is a Visual
Basic .NET (VB.NET) based program that allows for the concurrent editing of an
Access-based ORU database, the querying of the ORU database for analysis of
specific sets of ORUs, and the subsequent generation of corrective maintenance
(CM) and preventative maintenance (PM) failure rate data for that set of ORUs.

To analyze the inherent uncertainties of CMAM results, a representative
set of ORUs was chosen from the comprehensive ORU database (MADS)⁴. The

4  MADS  or  the  Modeling  Analysis  Data  Set  is  the  comprehensive  ORU  data  set  with 
associated  ORU  reliability  data.  It  is  the  primary  responsibility  of  NASA  R&M  to  update  MADS 
and  it  is  used  extensively  by  NASA  and  Boeing  L&M  teams  for  reliability  analysis. 


representative set of ORUs consists of 60 internal and 60 external ORUs with
the highest criticality code (C1)⁵ and the heaviest weight to orbit⁶. This set of
ORUs was assumed to have the same generic failure distribution characteristics
as the entire set of ISS ORUs.

Comparison of CMAM and RMAT results for these 120 ORUs
shows a 7.5% discrepancy in failure rates over the life of the ISS.
External ORU failure rate predictions of the two programs are within
2% of each other, while internal failure rate forecasts differ by approximately
8%. CMAM and RMAT therefore show a relatively close correlation when overall
failure rates are compared.

The CMAM program is most sensitive to changes in Preventative
Maintenance timeframes for short-MTBF (Mean Time Between Failures)⁷ ORUs
when wear-out failures are modeled. Since failure rates for wear-out failures
modeled with a Weibull distribution rise steeply towards the end of life of an
ORU, it is imperative to conduct preventative maintenance of these parts in a
timely manner. If short-MTBF ORUs are allowed to operate until failure (without
preventative maintenance), these parts will produce very high failure rates in the
forecast model. Analysis of the 120 ORUs within the CMAM database reveals
that the majority of planned PM is for internal ORUs, while external parts are
more often allowed to operate past predicted failure (based upon criticality of
component). Since RMAT uses a similar Weibull distribution algorithm and
similar shape factors⁸, and if the critical ORU list used for CMAM results is
considered to be “representative” of the entire ORU list used by RMAT, then it
can be assumed that RMAT is also sensitive to preventative maintenance

5 Criticality Code 1 (C1) is defined as a single point failure that could result in loss of Space
Station or loss of flight or ground personnel.

6 Weight to orbit was an arbitrary choice for ORU set analysis, and was made early in this
research due to ORU/spare up-mass to the ISS being such a critical factor at this time.

7 MTBF for this thesis is defined as the average time (in hours) that a component works
without failure.

8 The Weibull distribution, often used to model wear-out failures of components, has three
parameters: location, scale, and shape.
scheduling, especially for external ORUs. This sensitivity, along with the inherent
uncertainty of ORU MTBF values, may be enough to explain the divergence
between forecasted external failure rates and actual failures. The results of this
sensitivity analysis are discussed in detail within this thesis.
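
The sensitivity to preventative maintenance timing described above can be illustrated with a short sketch. It is not the CMAM or RMAT algorithm. It assumes a two-parameter Weibull life distribution and an idealized age-replacement policy, in which a part is replaced at failure or at the PM interval, whichever comes first. The function names and the shape and scale values are hypothetical and chosen only for illustration.

```python
import math

def weibull_reliability(t, shape, scale):
    """Probability that a part survives to age t (two-parameter Weibull)."""
    return math.exp(-((t / scale) ** shape))

def weibull_mtbf(shape, scale):
    """Mean life of a Weibull part: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def corrective_failure_rate(pm_interval, shape, scale, steps=10_000):
    """Long-run corrective failures per hour under age replacement.

    Each cycle ends at a failure or at the PM interval, whichever comes
    first, so the rate is F(T) divided by the expected cycle length,
    which is the integral of R(t) from 0 to T.
    """
    dt = pm_interval / steps
    area = 0.0
    # Trapezoidal estimate of the expected cycle length.
    for i in range(steps):
        r0 = weibull_reliability(i * dt, shape, scale)
        r1 = weibull_reliability((i + 1) * dt, shape, scale)
        area += 0.5 * (r0 + r1) * dt
    prob_fail = 1.0 - weibull_reliability(pm_interval, shape, scale)
    return prob_fail / area

# Hypothetical wear-out part: shape 3, scale 10,000 hours (assumed values).
SHAPE, SCALE = 3.0, 10_000.0
rate_with_pm = corrective_failure_rate(5_000.0, SHAPE, SCALE)
rate_run_to_failure = 1.0 / weibull_mtbf(SHAPE, SCALE)
print(f"failures/hour with PM at 5,000 h: {rate_with_pm:.2e}")
print(f"failures/hour run to failure:     {rate_run_to_failure:.2e}")
```

Under these assumed values the run-to-failure rate is several times the rate with timely PM, which is the qualitative effect described for short-MTBF ORUs that are allowed to operate past predicted failure.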

CMAM  could  potentially  be  used  in  concert  with  RMAT  to  provide  a  “first 
cut”  forecast  of  ISS  ORU  failures  and  crew  requirements  to  give  L&M  planners  a 
general  idea  of  what  failures  they  can  expect  while  waiting  for  the  comprehensive 
RMAT results. CMAM results can also be used as a sensitivity check of RMAT
for random and wear-out failure modes when predicting required corrective and
preventative maintenance actions. These comparative results could lead to the
rapid  determination  (and  the  corresponding  correction)  of  future  divergence 
issues  between  RMAT  results  and  historical  (actual)  ORU  failure  rates. 


I.  INTRODUCTION 


A.  PURPOSE 

Assembly of the International Space Station (ISS) began in November
1998 and will continue until completion sometime around 2010. During assembly
and over the ISS’s nominal 10-year lifetime, it will serve as an orbital platform for
the United States and International Partners to make advances in microgravity,
space life, and earth sciences, as well as engineering research and technology
development. The utilization of the ISS for creating knowledge and technology is
an enterprise that not only requires the initial construction of a safe and viable
orbiting laboratory, but also requires the maintenance of this one-of-a-kind
system on a continuous basis in order to optimize the functional availability of
systems required for both experimentation and crew life support.

The  ISS  Logistics  and  Maintenance  (L&M)  Organization  has  the  following 
philosophy: 

The  ISS  does  not  have  any  landing  gear,  is  not  a  satellite  exploring 
the  solar  system,  it  has  no  International  borders,  and  it  has  no 
organizational  lines.  It  is  one  Station,  that  must  be  supported  by 
ONE crew, twenty-four hours a day, seven days a week,
three-hundred and sixty-five days a year.⁹

With this in mind, the NASA and Boeing L&M teams rely upon developed, tested,
and  proven  modeling  programs  in  order  to  forecast  Orbital  Replacement  Unit 
(ORU)  failure  data  and  associated  maintenance  requirements  for  use  in 
operational  planning  and  to  determine  if  the  ISS  is  logistically  supportable  in 
current  and  future  configurations.  The  primary  tool  for  ORU  predictive  analysis  is 
the  RMAT  program. 

This  thesis  report  describes  the  RMAT  program  used  by  NASA  and 
Boeing  L&M,  and  introduces  a  new  predictive  tool  (CMAM),  developed  as  part  of 
this  research,  which  replicates  some  of  the  basic  functions  of  RMAT.  The 
objective  is  to  gain  a  better  understanding  of  the  RMAT  program,  and  to  develop 

9 NASA Logistics and Maintenance Overview Briefing, March 8, 2004.
a user-friendly tool for conducting quick ORU failure analysis and sensitivity
analysis of various failure parameters. These results can be used to determine
what may be the root cause of the divergence discovered between RMAT
results and historical failures for external ISS ORUs.

B.  LOGISTICS  AND  MAINTENANCE  OF  THE  ISS 

The ISS has over 5700 orbital replacement units (ORUs), and spare parts
that number into the hundreds of thousands. Parts for the ISS come from 127
major US vendors and 70 major international vendors. In most cases, these
parts must be shipped from the prime vendor (often called the Original
Equipment Manufacturer, or OEM) to Johnson Space Center (JSC) in
Houston, TX, for testing, and then to Kennedy Space Center (KSC), FL, for
follow-on delivery to the ISS.

The  objective  of  the  ISS  L&M  organization  is  to  define,  procure,  deliver, 
and  manage  the  resources  required  to  support  and  maintain  ISS  systems  and 
support  equipment  both  on-orbit  and  on  the  ground.  The  mission  statement  of 
the  L&M  team  is  two-part: 

Part  I:  During  Design  and  Development  Phase  to  define  necessary 
supportability  requirements  and  to  ensure  they  are  planned  for  and  met  in  order 
to  economically,  time  effectively,  and  safely  support  successful  operations. 

Part  II:  During  Operations  Phase  to  manage  logistics  resources  and 
conduct  maintenance  operations  that  ensure  that  the  on-orbit  vehicle  and  its 
associated  systems  support  safe,  successful  operations  and  utilization. 

The On-Orbit Ops and Maintenance Re-supply section within the NASA
L&M organization is responsible for continuous monitoring of up-mass and crew
time required for maintenance and on-orbit stowage of spares both inside and
outside of the Station. This section uses a specific tool (the Reliability and
Maintainability Assessment Tool, or RMAT) to help predict and evaluate all
on-orbit maintenance. ISS L&M management found that initial outputs of the RMAT
predictive model for the assembly phase (specifically flights 2A through
12A) show that these flights, and continued ISS operations and assembly,
have limited failure tolerance and redundancy, especially in the power and
thermal systems areas. The main goal of the On-Orbit Maintenance Re-supply
team is not simply to buy more spares but to optimize the ability to
re-supply spares quickly when needed and to store the most critical ORUs onboard
in the proper quantity during the assembly stages.

The ISS Program office has the overall responsibility for oversight of the
three main ISS contractors: Boeing, the United Space Alliance (USA), and the
Blackhawk Corporation. Boeing Logistics and Maintenance, headquartered in
Houston, TX, is responsible for the production of the On-Orbit Logistics
Supportability Assessment Report (LSAR). The On-Orbit LSAR is a bi-annual
report that uses historical data and predictive analysis (primarily through the use
of RMAT) to make assessments of the ongoing logistic supportability of the ISS.

C. LOGISTICS SUPPORTABILITY ASSESSMENT

The overall goal of the ISS L&M system is to support the Station within the
program's limited resources, to provide a safe and habitable environment for the
crew, and to minimize ISS system downtime (downtime that impacts the function
of the ISS as a research facility). To accomplish this goal, the L&M personnel
provide periodic assessments to determine the resource requirements needed to
logistically support the ISS as designed and built. These requirements are
summarized in the On-Orbit Logistics Supportability Assessment Report (LSAR).
The resources for maintenance of the ISS taken into account within the LSAR
include:¹⁰

•  Spares  and  spare  parts  (ORUs,  and  other  spares) 

•  Launch  locations  for  storage  of  spares 

•  Tools  for  performing  required  maintenance 

•  Extra-Vehicular  Activity  (EVA) 

•  Extra-Vehicular  Robotic  capability  (EVR) 

•  Transportation  assets  for  transport  of  supplies  to  the  ISS 

In  order  to  understand  the  LSAR  it  is  important  to  understand  the  two 
basic  ISS  maintenance  types,  and  the  ISS  3-Level  Maintenance  Concept. 

10 This list covers only major resource items and is not all-inclusive.

Maintenance for the ISS will be either Corrective Maintenance (CM) or
Preventative Maintenance (PM). Corrective Maintenance is maintenance
performed to repair or replace ORUs/spare parts that fail while in service.
Preventative Maintenance is maintenance performed to replace ORUs/spare
parts that have a specified operational life (in accordance with reliability data)
and have not yet failed but have reached the end of that operational
life.

Within  the  ISS  3-Level  Maintenance  Concept  there  are  three  levels  of 
maintenance:  Organizational,  Intermediate,  and  Depot  maintenance. 

Organizational Maintenance is either corrective maintenance, by on-orbit
replaceable unit removal and replacement or in situ repair, or preventative
maintenance, through scheduled change-out of items, servicing, or inspection,
in order to maintain system function in an operational condition and to prevent
degradation of ISS performance. Organizational repair can occur either on-orbit
or on the ground.

Intermediate  Maintenance  is  corrective  maintenance  only  to  repair  ORUs 
by  disassembly,  repair  and  reassembly,  and  is  in  response  to  real-time 
requirements  for  a  work-around  solution.  This  type  of  maintenance  is  on-orbit 
and  internal  to  the  ISS  only. 

Depot Maintenance is corrective maintenance to repair/overhaul a
designated hardware item that cannot be accomplished at the other maintenance
levels (this requires broken ORUs/spare parts to be returned to Earth and fixed
either at one of the 4 NASA depots or back at the OEM facility).

Thus it can be stated that there are only two levels of on-orbit
maintenance. Organizational maintenance consists of removal and replacement of
ORUs, in situ repair, servicing, and manual fault isolation, and is conducted
within the ISS (IVA), through external spacewalk (EVA), or through external
robotics (EVR). Intermediate maintenance consists of removal and replacement of
ORUs at a maintenance work area through the application of authorized repair
kits, and is conducted IVA only.

The major logistics resources available to support the ISS include ORUs/
spare parts, locations for storage of spares, tools for performing required
maintenance, Intra-Vehicular Activity (IVA), Extra-Vehicular Activity (EVA),
Extra-Vehicular Robotic capability (EVR), and transportation assets for transport of
supplies to the ISS.

It is important to note that, for the purposes of the ISS On-Orbit LSAR and
predictive analysis, ISS re-supply/logistics support after Assembly Complete is
still based upon four (4) shuttle flights per year: three defined as mixed missions
(pressurized/un-pressurized) and one completely un-pressurized. These four
flights are also planned for the assembly phase. The assembly phase requires a
large amount of new hardware (for assembly of the ISS) and thus leaves much less
capacity for ORU/spares up-mass. Therefore, it is expected that a supply
backlog will build up during the later stages of the assembly phase and will take
some time to work off upon Assembly Complete. This backlog consists of
maintenance  and  repair  actions  needed  to  restore  the  optimal  functionality  and 
redundancy  of  ISS  systems  (to  date  the  backlogs  have  consisted  primarily  of 
non-critical  ORUs  since  ISS  on-orbit  maintenance  is  scheduled  based  upon  the 
priority  (criticality)  of  the  task).  ORU  backlog  is  another  critical  element  in  the 
predictive  analysis  process. 

The  overall  goal  of  the  L&M  effort  is  to  maximize  the  availability  of  key 
functions  of  the  ISS  while  maintaining  a  safe  environment  for  the  crew.  Boeing 
L&M  and  NASA  utilize  a  tool  called  the  Station  Availability  Reporting  Tool 
(START)  to  provide  a  snapshot,  and  running  cumulative  tally  (monthly)  of  Station 
hardware  and  functional  availability.  The  10  key  functions  looked  at  for  this 
availability  report  include: 

•  Provide  usage  power 

•  Provide  CO2  removal

•  Provide  Intra-module  Temperature  and  Humidity  Control  (THC) 

•  Provide  Internal  Thermal  Control  System  (ITCS)  Heat  Transfer 

•  Provide  Command  &  Telemetry  (uplink/downlink) 

•  Provide  Robotics  capability 

•  Provide  Payload  Data  Downlink

•  Provide  Command  and  control 

•  Provide  Extra-Vehicular  Activity  (EVA)  capability 

•  Provide  Fire  Detection/  Suppression 

These  functions  are  reported  in  four  separate  ways: 

•  Predicted  availability  (shows  a  prediction  of  CURRENT  availability 
of  the  function  based  on  current  sparing  levels) 

•  Performance  since  activation  of  the  function  (measured) 

•  Performance  for  the  last  6  months  (measured) 

•  Availability  Objective  (estimated  goals  primarily  for  the  performance 
since  activation) 

The  supportability  assessment  addresses  any  shortcomings  listed  on  the 
functional  availability  reports,  and  also  assesses  whether  future  functional 
objectives  will  be  attainable,  based  primarily  on  predictive  analysis.  The  primary 
predictive  analysis  tool  used  in  developing  the  LSAR  is  the  Loral  Reliability  and 
Maintainability  Assessment  Tool  (RMAT). 

D.  PREDICTIVE  ANALYSIS  USING  RMAT 

The Reliability and Maintainability Assessment Tool (RMAT) is a Monte
Carlo-based simulation tool used to project maintenance demands, including
maintenance performed and resultant backlog. The main constraints of the
RMAT program for simulation include (but are not limited to):

•  Available  spares 

•  Robotic  capability 

•  Available  weight/volume  to  orbit  (primarily  during  assembly  stage) 

RMAT also has a number of input parameters that need to be
entered in order to make predictions. There are 19 parameters that can
be altered to affect predictions. However, the primary parameters include:

• Mean Time Between Failures (MTBF), to predict the number of corrective
actions needed per ORU

• Mean Time To Repair (MTTR), to predict the number of crew hours needed
(both CM and PM actions)

•  Life  Limit  of  ORU 

•  Number  of  crew  members  required  to  perform  maintenance  task 

•  Quantities  (of  spares  on  orbit)

•  Reliability  class  (criticality) 

•  Duty  cycle 

•  Frequency  of  Preventative  Maintenance 
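
To make the role of the primary parameters concrete, the following sketch shows one simple way they could combine into an expected number of corrective actions and the associated crew hours. It is a deterministic approximation written for illustration, not the RMAT or CMAM calculation. The field names are hypothetical, `quantity` is assumed here to mean installed units rather than spares, and treating the K-factor as a straight multiplier on the failure count is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class OruInputs:
    """Hypothetical subset of per-ORU inputs; field names are illustrative."""
    mtbf_hours: float      # Mean Time Between Failures
    mttr_hours: float      # Mean Time To Repair, per maintenance action
    crew_size: int         # crew members required per action
    quantity: int          # installed units of this ORU type (assumed meaning)
    duty_cycle: float      # fraction of calendar time the unit operates
    k_factor: float = 1.0  # multiplier for actions beyond inherent failures

def expected_cm_actions(oru, calendar_hours):
    """Expected corrective actions, assuming a constant failure rate."""
    operating_hours = calendar_hours * oru.duty_cycle * oru.quantity
    return oru.k_factor * operating_hours / oru.mtbf_hours

def expected_cm_crew_hours(oru, calendar_hours):
    """Crew hours: actions x repair time x crew members per action."""
    return expected_cm_actions(oru, calendar_hours) * oru.mttr_hours * oru.crew_size

# Illustrative values only; they are not taken from MADS.
example = OruInputs(mtbf_hours=50_000.0, mttr_hours=2.0, crew_size=2,
                    quantity=4, duty_cycle=0.5, k_factor=1.2)
actions_per_year = expected_cm_actions(example, 8_760.0)
crew_hours_per_year = expected_cm_crew_hours(example, 8_760.0)
```

Because MTBF appears in the denominator, a proportional error in the MTBF estimate carries through directly to the forecast number of actions and crew hours in this simplified form.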

RMAT  takes  these  constraints  and  input  parameters  and  utilizes  Monte 
Carlo  processes/mathematical  algorithms  to  predict  ORU  failures,  predict 
corrective/preventative  maintenance  requirements,  predict  size  and  impact  of 
maintenance  backlog,  and  predict  ISS  resources  needed  to  keep  the  ISS  in  an 
operational  state. 

RMAT uses iterative Monte Carlo simulation (primarily to account for
inherent uncertainties within ORU predicted MTBF and K-factor¹¹ values) and
mathematical algorithms (primarily for failure distribution calculations) for
maintenance demand forecasting. An RMAT run consists of 600 iterations of a
specific set of constraints/input parameters placed into the simulation model for a
specific timeframe. The maintenance demand/result is an average of the
iterations. RMAT generates failures of three basic types:

•  Infant  Mortality  failures:  failures  that  occur  at  a  higher  rate  early  in 
the  lifetime  of  the  hardware. 

•  Random  failures:  failures  that  occur  randomly  throughout  the  life  of 
the  ISS. 

• Wear-out/life-limited failures: failures that occur at a higher rate as
the hardware approaches end of life.

RMAT uses the following distributions when modeling these three types of
failures:

•  Exponential  distribution:  Random  Failures 

•  Weibull  distribution:  Infant  Mortality  and  Wear-Out  failures 
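
The general idea of averaging many simulated iterations over these distributions can be sketched as follows. This is a minimal illustration, not RMAT's implementation: it samples only times to failure, assumes immediate replacement with no spares or crew constraints, and does not model the uncertainty in MTBF or K-factor values. The MTBF, scale, and shape values and the function names are hypothetical; only the figure of 600 iterations comes from the description of an RMAT run.

```python
import random

def count_failures(draw_life, horizon_hours):
    """Count failures at one ORU position over the horizon.

    draw_life() returns the life (hours) of a freshly installed unit; a
    failed unit is assumed to be replaced immediately.
    """
    clock, failures = 0.0, 0
    while True:
        clock += draw_life()
        if clock > horizon_hours:
            return failures
        failures += 1

def mean_failures(draw_life, horizon_hours, iterations=600):
    """Average the failure count over independent iterations."""
    total = sum(count_failures(draw_life, horizon_hours) for _ in range(iterations))
    return total / iterations

rng = random.Random(2004)  # fixed seed so the sketch is repeatable

def random_life():
    # Exponential lives model random failures; the 20,000 h MTBF is illustrative.
    return rng.expovariate(1.0 / 20_000.0)

def wear_out_life():
    # Weibull lives with shape > 1 model wear-out; scale 20,000 h and shape 3 are illustrative.
    return rng.weibullvariate(20_000.0, 3.0)

HORIZON = 10 * 8_760.0  # ten years of continuous operation, in hours
random_mean = mean_failures(random_life, HORIZON)
wear_out_mean = mean_failures(wear_out_life, HORIZON)
```

For the exponential case the averaged count should settle near the horizon divided by the MTBF, which gives a quick sanity check on the simulation before more elaborate constraints are added.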

These three failure types and their respective distributions lead to an
overall ORU life cycle curve that resembles what is called the “bathtub” curve:


11 K-factor is a multiplier that accounts for increased equipment maintenance actions not
included in the inherent MTBF estimates. See section III.C for further details.

Figure 1. ORU life-cycle failures versus time (from: Beardmore)
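
The shape of the bathtub curve can be reproduced numerically by adding the failure rates of the three failure types. The sketch below assumes the three modes act independently, so that their hazard rates sum, and uses two-parameter Weibull hazards with a shape below 1 for infant mortality and above 1 for wear-out. All parameter values are hypothetical, and this is an illustration of the concept rather than the algorithm used by RMAT.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a two-parameter Weibull at age t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def bathtub_hazard(t, random_rate, infant, wear_out):
    """Sum of infant-mortality, random, and wear-out hazards at age t > 0.

    infant and wear_out are (shape, scale) pairs; random_rate is the
    constant rate of the exponential (random) failure mode.
    """
    return (weibull_hazard(t, *infant)
            + random_rate
            + weibull_hazard(t, *wear_out))

# Illustrative parameters only (hours).
RANDOM_RATE = 1.0 / 100_000.0
INFANT = (0.5, 50_000.0)    # shape < 1: rate falls with age
WEAR_OUT = (4.0, 60_000.0)  # shape > 1: rate rises with age

early, middle, late = (
    bathtub_hazard(t, RANDOM_RATE, INFANT, WEAR_OUT)
    for t in (100.0, 20_000.0, 80_000.0)
)
```

With these values the failure rate is highest early and late in life and lowest in between, matching the qualitative shape of Figure 1.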

There are four primary outputs of the RMAT program:

• Predicted maintenance actions required by flight (these
maintenance actions are EVA/IVA/EVR actions required in
response to a failure, a scheduled preventative maintenance
remove and replace (PMRR), or a required servicing/inspection
activity)

• Maintenance action backlog (the backlog is made up of ORUs
awaiting maintenance action due to a shortfall in resources:
on-orbit spares, Shuttle up-mass, or Shuttle flights)

• Predicted crew time for maintenance actions by flight (this
includes EVA/IVA/EVR man-hours consumed to conduct
maintenance activities)

• Predicted up-mass requirements by flight (this includes total ORU
spares weight that will be launched to conduct corrective or
scheduled maintenance)

E.  OVERALL  RESULTS  FROM  RMAT 

Simulation results for the internal maintenance (IVA) of the ISS show
adequate support both during and post assembly. RMAT results show a slightly
negative margin between the number of maintenance actions required and the
number of maintenance actions that will be performed, leading to a minimal
backlog buildup for IVA maintenance actions during the assembly stage. Since
RMAT conducts maintenance based on ORU priority, none of the ORU items
within the backlog are high-criticality (C1) items. RMAT also shows a highly
positive margin upon Assembly Complete, leading to the rapid work-off of this
backlog. This positive margin during post-assembly is due to the fact that more
up-mass and crew time can be allocated for spares and maintenance (as
opposed to assembly).

External maintenance (EVA/EVR) does not seem to be as well supported.
RMAT results show negative margins between maintenance required and
maintenance performed both during assembly and after Assembly Complete.
This will lead to a continual increase in EVA/EVR maintenance action backlog.
Specifically, RMAT predicts that the ISS will require an average of 70 external
maintenance actions per year during the post-assembly stage. Of these 70, an
average of 31.5 will be performed (based on available up-mass and EVA/EVR
crew times).
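
The arithmetic behind the growing external backlog is straightforward and can be sketched as below. The yearly averages of 70 required and 31.5 performed actions are the RMAT figures quoted above; holding them constant from year to year, and the function name itself, are simplifying assumptions made for illustration.

```python
def backlog_by_year(required_per_year, performed_per_year, years, starting_backlog=0.0):
    """Cumulative maintenance backlog at the end of each year.

    The backlog grows by the shortfall (required minus performed) and
    is never allowed to go below zero.
    """
    backlog = starting_backlog
    history = []
    for _ in range(years):
        backlog = max(0.0, backlog + required_per_year - performed_per_year)
        history.append(backlog)
    return history

# 70 required and 31.5 performed external actions per year.
external_backlog = backlog_by_year(70.0, 31.5, years=5)
print(external_backlog)
```

Under these assumptions the external backlog grows by 38.5 actions each year, which is why a persistent negative margin cannot be worked off without additional up-mass or EVA/EVR crew time.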

While  RMAT  seems  to  be  predicting  internal  maintenance  actions 
relatively  accurately,  there  is  a  growing  divergence  between  RMAT  external 
maintenance  predictions  and  actual/historical  failures  gathered  to  date.  When 
comparing  RMAT  predicted  results  to  historical  actuals  (for  both  IVA  and  EVA 
respectively)  the  following  results  were  seen: 

IVA: Cumulative IVA forecasted Corrective Maintenance (CM) crew times
exceeded actuals by 3%, while actual Preventative Maintenance (PM) crew times
exceeded forecasts by approximately 11%. Total CM/PM actions turned out to be
within 15% of reported actuals.

EVA:  Cumulative  EVA  forecasted  CM  crew  times  exceeded  actuals  by 
over  95%,  while  forecasted  PM  crew  times  exceeded  actuals  by  100%  (no  actual 
PM  external  activities  were  recorded).  Average  CM/PM  EVA  actions  per  year 
were  forecasted  to  be  nearly  44  and  12  respectively,  while  only  5  EVA  CM 
actions  were  performed  in  total. 

NASA  and  Boeing  L&M  are  currently  examining  these  divergence  issues 
by  reviewing  reliability  data  and  RMAT  model  input  fields  in  order  to  determine 
the  most  sensitive  aspects  of  the  model  itself  and  to  estimate  the  effects  of 
variance  resulting  from  inconsistencies  between  model  and  on-orbit  maintenance 
assumptions. This thesis attempts to assist in this activity by developing an
independent  reliability  model  that  replicates  some  of  the  basic  functionality  of 
RMAT  and  can  be  used  comparatively  to  determine  what  input  parameters  have 
the  greatest  effect  on  model  outputs. 

II. CMAM DEVELOPMENT


A.  OVERVIEW 

The ISS Comparative Maintenance Analysis Model (CMAM) is a Visual Basic .NET®
program that calculates both corrective maintenance (CM) and preventative
maintenance (PM) action requirements for ISS Orbital Replacement Units
(ORUs) and associated crew maintenance time requirements (IVA/EVA/EVR)¹².
CMAM was developed to replicate some of the functionality of RMAT in
order to gain a better understanding of the algorithms used by RMAT and to
provide a basis for assessing the sensitivity of the two programs to changes in
similar input parameters. Additionally, CMAM is meant as a user-friendly alternative
to the much more complex RMAT program for the understanding of general ORU
failure rate data. CMAM development required the
development of a separate ORU database constructed in Microsoft Access®
(see Appendix A), which was populated with a representative set¹³ of ORUs from
the entire NASA/Boeing L&M ORU Modeling Analysis Data Set (MADS).
Upon completion of the CMAM program, output data was gathered from both
CMAM and RMAT (based upon similar input parameters) and the results
compared. Once the two output sets were compared, a sensitivity analysis was
conducted on CMAM by altering the assumed failure rate distributions and
associated input parameters to determine the effects on output values.
Additionally, the Monte Carlo simulation package Crystal Ball® was used
to quantify the uncertainties inherent within CMAM results. Finally, based upon
the similarities between the RMAT and CMAM programs for calculating a narrow
field of failure rate data, parallels were drawn between the two programs.


12 IVA is Intra-Vehicular Activity, EVA is Extra-Vehicular Activity or “spacewalk”, and EVR is
Extra-Vehicular Robotics.

13 A total of 120 C1 ORUs (60 internal and 60 external) that were thought to display the
same failure rate trends as the entire ORU set were chosen from the MADS list.

B. CMAM

1.  Basic  Functionality 

CMAM  allows  for  the  calculation  of  both  Corrective  Maintenance  and 
Preventative  Maintenance  Actions  as  well  as  the  associated  required  crew 
maintenance  time  (in  the  areas  of  IVA/EVA/EVR)  for  a  single  ORU  or  a  specified 
set  of  ORUs.  The  following  figure  is  a  flowchart  functionality  diagram  of  CMAM: 


Figure  2.  CMAM  Functionality  Diagram 

CMAM calculates ORU failure rates utilizing both random failures and
wear-out failures for each ORU. However, CMAM does not model or calculate infant
mortality failures at this time. It should be noted that while RMAT has the
capability of modeling infant mortality events and “bad apple” failures, this
capability can be, and often is, turned off.¹⁴ Unlike RMAT, the CMAM program does
not take into account a spares list or available crew time; thus it does not
calculate any type of backlog (CM/PM maintenance action or crew time backlogs).
The following diagram shows the overall functionality of the RMAT program
for comparison with CMAM:


14 Early failure modeling options within RMAT include: Fisher Price, Bad Apple, and No
Early Failure. The No Early Failure option is often used because of the ORU burn-in process
conducted by part manufacturers.

Figure 3. RMAT Functionality Diagram (from: The Boeing Company)

Lastly, CMAM is dynamically linked to a Microsoft Access®-based ORU
database, which not only allows the database to be updated from the CMAM
graphical user interface, but also allows real-time querying of the database for
specific sets of ORUs for failure rate calculation. CMAM is designed with an
easy-to-use query form that has the following built-in database query types:

ORU  Search  By: 

•  Assembly  Flight 

•  ISS  Operational  Year  (Decimal  Dated  year) 

•  ISS  Subsystem 

•  ORU  name 

•  Internal  or  External  Component 

•  Entire  ORU  Database 

2.  Database  Connectivity 

The  CMAM  ORU  database  consists  of  three  tables:  an  ORU  master  parts 
list,  an  ISS  Flight  table,  and  an  ISS  Subsystem  table.  The  ORU  master  parts  list 
and  the  ISS  Flight  table  can  be  updated  or  modified  from  the  CMAM  user 
interface. 

The current CMAM database is populated with a “representative” set of
ORUs for comparison against RMAT, on the assumption that the
primary sensitivities of this representative set will also be the primary sensitivities
of the ORU database as a whole. For the purposes of this thesis it was time
prohibitive to enter the entire MADS ORU database into the CMAM Access®
database. The MADS database consists of approximately 1380 separate (unique)
ORUs with 59 separate fields for each ORU. The CMAM database uses only 26
of these 59 fields. Entry of all 1380 ORUs using the CMAM ORU update mask is
estimated to take between 1.5 and 2 minutes per ORU, or between 34.5 and 46
hours for the entire ORU database.
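The data-entry estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (CMAM itself is written in Visual Basic .NET); the function names are illustrative only and do not come from CMAM. The second function applies the same check to the share of the MADS database covered by the 120-ORU representative set used in this chapter.

```python
def entry_hours(oru_count: int, minutes_per_oru: float) -> float:
    """Total data-entry time in hours for a given per-ORU entry time."""
    return oru_count * minutes_per_oru / 60.0


def sample_share(sample_size: int, population_size: int) -> float:
    """Fraction of the full database covered by a representative set."""
    return sample_size / population_size


if __name__ == "__main__":
    # Figures quoted in the text: about 1380 ORUs, 1.5 to 2 minutes each.
    low = entry_hours(1380, 1.5)
    high = entry_hours(1380, 2.0)
    print(f"entry time: {low:.1f} to {high:.1f} hours")
    print(f"representative set: {sample_share(120, 1380):.1%} of MADS")
```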

The CMAM representative ORU set makes up approximately 8.7% of the
MADS database (120 ORUs) and consists of the highest criticality
components sorted by weight and volume (i.e., the top 60 Criticality Code 1 ORUs
with the highest volume and weight requirements to orbit were chosen from
each of the internal and external ORU parts lists). See Appendix 1 for further
database details.

3. CMAM Distributions

a.  Overview 

A  user  considers  a  system  reliable  if  it  is  available  and  operational 
when  needed.  From  an  engineering  standpoint,  reliability  is  the  ability  of  a 
system  or  unit  to  perform  a  required  function  under  an  assumed  or  stated  set  of 
conditions, for a specified period of time. Reliability is quantified by
treating it as a probability distribution. The probability of a
component surviving to a time t is the reliability R(t), and is expressed as:

R(t) = # surviving at instant t / # at time = 0

A component failure can be classified into one of two groups: (1)
degradation failures, where an important subcomponent drifts so far from original
tolerance values that the component no longer functions, or (2) catastrophic
failures, where the component reaches end of life. The failure rate can be
expressed as:

λ(t) = # failing per unit time at instant t / # surviving at instant t

The failure rate can therefore be defined as the probability of failure in unit time
of a component that is still working satisfactorily.
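As a small illustration of the two counting definitions above, the sketch below computes an empirical reliability and an empirical failure rate from unit counts. The function names and the sample numbers in the usage section are hypothetical and are not taken from the MADS data.

```python
def empirical_reliability(surviving_at_t: int, initial_count: int) -> float:
    """R(t) = number surviving at instant t / number at time 0."""
    if initial_count <= 0:
        raise ValueError("initial_count must be positive")
    return surviving_at_t / initial_count


def empirical_failure_rate(failing_per_unit_time: float, surviving_at_t: int) -> float:
    """Failure rate = number failing per unit time at t / number surviving at t."""
    if surviving_at_t <= 0:
        raise ValueError("surviving_at_t must be positive")
    return failing_per_unit_time / surviving_at_t


if __name__ == "__main__":
    # Hypothetical population: 100 units at time 0, 80 still working at t,
    # and 4 of those failing per year at that instant.
    print(empirical_reliability(80, 100))
    print(empirical_failure_rate(4, 80))
```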

CMAM assumes two types of failure rates for ORUs: a constant
failure  rate  (to  model  the  random  failures  that  occur  during  the  intrinsic  life  of  the 
component),  and  an  exponentially  increasing  failure  rate  (to  model  the  wear-out 
failures  that  occur  towards  end  of  life  or  towards  the  end  of  the  intrinsic  life  of  the 
component).  The  CMAM  program  mimics  RMAT  by  using  the  exponential 
distribution  to  model  constant  rate  failures,  and  the  Weibull  distribution  for 
modeling  the  increasing  failure  rate  wear-out  failures. 

b.  The  Exponential  Distribution 

The exponential distribution is a relatively common distribution in
reliability engineering that models the behavior of components that have a
constant failure rate and results in the component having a reliability that
exponentially decreases through time. The following equations show the
exponential distribution and its important characteristics:

Exponential Distribution (2-parameter):

$$f(t) = \lambda e^{-\lambda (t - \gamma)}$$

where:

$\lambda$ = failure rate
$\gamma$ = location parameter = flight decimal date

$$\text{Reliability} = R(t) = e^{-\lambda (t - \gamma)}$$

$$\text{Failure Rate} = \lambda(t) = \frac{f(t)}{R(t)} = \lambda$$

$$\frac{1}{\lambda} = \text{MTBF}$$

Figure 4. Exponential Distribution Equations (from: Walpole)


It  is  important  to  note  that  the  two-parameter  exponential  distribution  is  utilized 
and  coded  into  CMAM.  Since  ORUs  become  activated  at  different  times  (flight 
decimal  dates)  CMAM  is  NOT  a  steady  state  calculation  program. 
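The role of the location parameter can be illustrated with a short sketch. This is not the CMAM source code (CMAM is written in Visual Basic .NET); it is a minimal Python rendering of the two-parameter exponential distribution, with hypothetical function names, in which a unit accrues no failure exposure before its activation date. The time `t`, the activation date `gamma`, and `1/lam` are assumed to share the same time unit.

```python
import math


def exp_pdf(t: float, lam: float, gamma: float) -> float:
    """Two-parameter exponential density; zero before activation at gamma."""
    if t < gamma:
        return 0.0
    return lam * math.exp(-lam * (t - gamma))


def exp_reliability(t: float, lam: float, gamma: float) -> float:
    """Probability of surviving to time t; 1.0 before activation."""
    if t < gamma:
        return 1.0
    return math.exp(-lam * (t - gamma))


def exp_failure_rate(t: float, lam: float, gamma: float) -> float:
    """Failure rate f(t)/R(t); constant (= lam) once the unit is active."""
    if t < gamma:
        return 0.0
    return exp_pdf(t, lam, gamma) / exp_reliability(t, lam, gamma)
```

Because each ORU carries its own activation date, summing these per-ORU rates over a set of ORUs gives a total that changes as more units come on line, which is one way to see why the calculation is not steady state.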

c.  The  Weibull  Distribution 

The Weibull distribution is a general purpose distribution used to
model material strength and the times-to-failure of electronic and mechanical
components, equipment, or systems. The most general (3-parameter) case of the
Weibull distribution was utilized in CMAM and is defined by the following
equations:

Weibull Distribution (3-parameter):

$$f(t) = \frac{\beta}{\eta}\left(\frac{t-\gamma}{\eta}\right)^{\beta-1} e^{-\left(\frac{t-\gamma}{\eta}\right)^{\beta}}$$

where:

$\beta$ = shape parameter = beta
$\gamma$ = location parameter = flight decimal date
$\eta$ = scale parameter = eta

$$\text{Reliability} = R(t) = e^{-\left(\frac{t-\gamma}{\eta}\right)^{\beta}}$$

$$\text{Failure rate} = \lambda(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta}\left(\frac{t-\gamma}{\eta}\right)^{\beta-1}$$

$$\text{MTBF}^{*} = \gamma + \eta\,\Gamma\!\left(\frac{1}{\beta}+1\right)$$

*MTBF = MTBMAtotal

$$\eta = \frac{\text{MTBMAtotal}}{\Gamma\!\left(\frac{1}{\beta}+1\right)}$$

$$\text{Gamma Function} = \Gamma(n) = \int_{0}^{+\infty} e^{-x} x^{n-1}\,dx$$

Figure 5. Weibull Distribution Equations (from: Walpole)¹⁵

The Weibull failure rate is a function of time; however, if the Weibull
shape factor (β) is equal to 1, the Weibull distribution displays a constant failure
rate and is in every characteristic identical to the exponential distribution. In fact,
the value of the Weibull shape factor (β) indicates which of the prevalent
failure modes is being modeled:


15 MTBMAtotal (Mean Time Between Maintenance Actions-total) is the adjustment to MTBF
values based upon the WP-4 Rutherford equation. See Section 2.4.a.

• β < 1 indicates infant mortality (poor production or insufficient burn-in)

• β = 1 indicates random failures

• β between 1 and 4 indicates early wear-out, early fatigue

• β > 4 indicates old age or rapid wear-out at end of life

CMAM allows the user to input the desired β value and is meant to be used to
model wear-out failures; thus β defaults to 5 (similar to the RMAT program).

It is important to note that the Weibull scale parameter (η), which is
imperative in determining the Weibull failure rate, is based on the Gamma
Function. The gamma function computation will be discussed in the next section
(CMAM Algorithms).
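The effect of the shape factor can be sketched in a few lines of Python. This is an illustrative sketch rather than the CMAM implementation: the function names are hypothetical, Python's built-in `math.gamma` stands in for the gamma function estimate that CMAM uses, and the scale parameter is assumed to follow η = MTBMAtotal / Γ(1/β + 1), which is how the Weibull figure in this section appears to read.

```python
import math


def weibull_eta(mtbma_total: float, beta: float) -> float:
    """Scale parameter eta, assuming eta = MTBMAtotal / Gamma(1/beta + 1)."""
    return mtbma_total / math.gamma(1.0 / beta + 1.0)


def weibull_reliability(t: float, beta: float, eta: float, gamma: float) -> float:
    """Probability of surviving to time t; 1.0 up to activation at gamma."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / eta) ** beta))


def weibull_failure_rate(t: float, beta: float, eta: float, gamma: float) -> float:
    """Weibull failure rate; treated as 0.0 up to activation at gamma."""
    if t <= gamma:
        return 0.0
    return (beta / eta) * ((t - gamma) / eta) ** (beta - 1.0)
```

With β = 1 the failure rate reduces to the constant 1/η and η equals MTBMAtotal, so the functions reproduce the exponential case. With β = 5 the failure rate grows with the fourth power of the time since activation.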

The  following  examples  show  the  Weibull  failure  rates  and 
cumulative  probability  of  a  component  surviving  (reliability)  over  time  for  both  a 
long  and  short  MTBF  ORU  with  a  beta  value  of  1: 


[Chart 3: Weibull distribution (Beta = 1) failure rate for Breathing Apparatus Assembly, by year]

[Chart 4: Weibull distribution (Beta = 1) probability of failure for Breathing Apparatus Assembly, by year]

Figure 6. Weibull Distribution Example, Beta = 1

When the Weibull shape factor (β) = 1, the failure rate remains constant and
the reliability of the component decreases exponentially with time. In the graphs
above it is important to note how quickly the reliability of the short MTBMA ORU
(Breathing Apparatus Assembly) decreases to approximately zero over the life of
the ISS¹⁶ (Chart 4), while the long MTBMA ORU (DC/DC Converter) still has
fairly high reliability over the same timeframe.

The  following  examples  show  the  Weibull  failure  rates  and 
cumulative  probability  of  a  component  failing  (inverse  reliability)  over  time  for 
both  a  long  and  short  MTBF  ORU  with  a  beta  value  of  5: 


Figure  7.  Weibull  Distribution  Example,  Beta  =  5 


16 Life of ISS for this thesis study includes assembly and post assembly time and is
approximated at 26 years.

With a Weibull shape parameter of β = 5, it is important to note that although
the failure rates increase exponentially in a similar fashion, the scaling of those
failure rates is radically different for short as opposed to long MTBMA ORUs
(charts 1 and 3).

4.  CMAM  Algorithms 

a.  The  Rutherford  Equation 

Each  assembly,  subassembly,  or  component  within  the  ISS  has  its 
own  inherent  reliability,  often  expressed  as  a  Mean  Time  Between  Failure 
(MTBF).  Often,  MTBF  by  itself  (without  any  modification)  is  used  as  the  basis  for 
determining  failure  rates.  This  practice  can  and  often  does  lead  to  unacceptable 
inaccuracies  in  actual  (and  forecasted)  failure  rates  due  to  two  factors:  usage, 
and  the  nature  of  reliability  data  itself. 

MTBF is by definition an average value of failure times based upon
a universal population of like devices/components. MTBF therefore does not
take into account duty cycles (component hot versus cold usage rates), human
error when performing corrective maintenance (K-factor), or other life limiting
factors (LifeLim). To address these issues, the following equations were developed
by L&M personnel; they serve as the basis for all MTBF corrections within CMAM
for corrective maintenance actions and are summarized as the WP-4 Rutherford
equations:


Taken from the 15 April 1991 Application of K-factor to Life estimates in External
Maintenance Solution Team (EMST) Steady State Algorithm paper.

WP-4 Rutherford Equation:

$$OP = DC + R - (DC \cdot R)$$

OP = Operating Ratio
DC = Duty Cycle
R = Hot-to-Cold MTBF = 1/35 (assumed value)

$$MTBF_{adj} = \frac{MTBF_{hot}}{OP}$$

$$L_{char} = LIFLIM \ \text{(yrs)}$$

$$MTTF_{adj} = MTBF_{adj}\left(1 - e^{-\left(8760 \cdot \frac{L_{char}}{MTBF_{adj}}\right)}\right)$$

MTBMArandom = f(K, MTBFadj, MTTFadj) [expression illegible in the source]

K = K-factor

$$MTBMA_{total} = MTBMA_{random}\left(1 - e^{-\left(8760 \cdot \frac{L_{char}}{MTBMA_{random}}\right)}\right)$$

CM:

$$CM/\text{year} = \frac{8760 \cdot QTY}{MTBMA_{total}}$$

PM:

$$PM/\text{year} = \frac{8760 \cdot QTY}{MTBPMRR}$$

Figure 8. Rutherford Equations (from: McDonnell Douglas Space Systems)
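The outer steps of the Rutherford calculation can be sketched as follows. This is a minimal Python illustration, not the CMAM code: it covers only the operating ratio, the duty-cycle adjustment of MTBF, and the conversion of a mean time between actions (in hours) into actions per year. The intermediate K-factor and life-limit adjustments are deliberately omitted, and the function names and the numbers in the usage section are hypothetical.

```python
HOURS_PER_YEAR = 8760.0
HOT_TO_COLD_RATIO = 1.0 / 35.0  # assumed value of R quoted with the equations


def operating_ratio(duty_cycle: float, r: float = HOT_TO_COLD_RATIO) -> float:
    """OP = DC + R - (DC * R)."""
    return duty_cycle + r - duty_cycle * r


def mtbf_adjusted(mtbf_hot: float, duty_cycle: float,
                  r: float = HOT_TO_COLD_RATIO) -> float:
    """MTBFadj = MTBFhot / OP, in the same units as mtbf_hot (hours)."""
    return mtbf_hot / operating_ratio(duty_cycle, r)


def actions_per_year(quantity: int, mean_hours_between_actions: float) -> float:
    """8760 * QTY / mean time between actions.

    Pass MTBMAtotal for corrective maintenance or MTBPMRR for preventative
    maintenance.
    """
    return HOURS_PER_YEAR * quantity / mean_hours_between_actions


if __name__ == "__main__":
    # Hypothetical ORU: 50,000 h hot MTBF, powered 60% of the time,
    # four units installed, replaced preventively every 43,800 h.
    print(mtbf_adjusted(50000.0, 0.6))
    print(actions_per_year(4, 43800.0))
```

A unit that is always powered (duty cycle of 1) has an operating ratio of 1 and keeps its hot MTBF, while a unit that is never powered is credited with 35 times its hot MTBF under the assumed ratio.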

Once again it is important to note that Preventative Maintenance
(PM) calculations are not a function of MTBMAtotal, and rely upon stated/
unmodified Mean Time Between Preventative Maintenance Remove and
Replace (MTBPMRR) times only.

b. The Gamma Function

As discussed earlier, the scale parameter (η) of the Weibull
distribution is dependent on the solution to the Gamma Function, which is defined
by:

$$\Gamma(n) = \int_{0}^{+\infty} e^{-x} x^{n-1}\,dx$$

Integration by parts with $u = x^{n-1}$, $dv = e^{-x}\,dx$ gives

$$\Gamma(n) = \left[-e^{-x} x^{n-1}\right]_{0}^{+\infty} + (n-1)\int_{0}^{+\infty} e^{-x} x^{n-2}\,dx = (n-1)\int_{0}^{+\infty} e^{-x} x^{n-2}\,dx$$

for $n > 1$, which yields the recursion formula

$$\Gamma(n) = (n-1)\,\Gamma(n-1)$$

Figure 9. Gamma Function (from: Walpole)

The recursion formula does not by itself yield an algebraic solution for
non-integer arguments; thus, in order to
code the Weibull distribution into CMAM, an estimate for the solution of the
Gamma function had to be used. An estimate of the solution to the Gamma
function can be attained through the use of Stirling’s Asymptotic Series. Stirling’s
series is as follows:

Stirling's Asymptotic Series

$$\Gamma(z) \approx e^{-z} z^{z-\frac{1}{2}} \sqrt{2\pi}\left(1 + \frac{1}{12z} + \frac{1}{288z^{2}} - \frac{139}{51840z^{3}} - \frac{571}{2488320z^{4}}\right)$$

Figure 10. Stirling’s Asymptotic Series (from: Beyer)

This asymptotic series in the form above is a series expansion of
the gamma function accurate to 4 decimal places, which provides reasonable
accuracy in failure rate calculation for the purposes of CMAM development.
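The truncated series can be checked against a library implementation of the gamma function. The sketch below is a Python illustration, not the CMAM routine; the function name is hypothetical, and Python's `math.gamma` is used only as a reference value. The coefficients are those of the standard Stirling series through the fourth-order term.

```python
import math


def stirling_gamma(z: float) -> float:
    """Approximate Gamma(z) with Stirling's series truncated after 1/z**4."""
    if z <= 0.0:
        raise ValueError("z must be positive")
    series = (1.0
              + 1.0 / (12.0 * z)
              + 1.0 / (288.0 * z ** 2)
              - 139.0 / (51840.0 * z ** 3)
              - 571.0 / (2488320.0 * z ** 4))
    return math.exp(-z) * z ** (z - 0.5) * math.sqrt(2.0 * math.pi) * series


if __name__ == "__main__":
    # With beta = 5 the Weibull scale parameter needs Gamma(1/5 + 1) = Gamma(1.2).
    for z in (1.2, 2.0, 5.0):
        approx = stirling_gamma(z)
        exact = math.gamma(z)
        print(f"z={z}: series={approx:.6f} library={exact:.6f} "
              f"abs error={abs(approx - exact):.2e}")
```

Because the series is asymptotic, its error shrinks as the argument grows, so the accuracy obtained depends on the argument; the smallest arguments used for the scale parameter (close to 1) are the least favorable case.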



5.  CMAM  Output 

The  CMAM  output  report  is  in  the  form  of  four  separate  text  files  that  show 
the  following: 

•  File  1 :  Maintenance  Actions  per  year 

•  File  2:  IVA  crew  time  requirements  per  year 

•  File  3:  EVA  crew  time  requirements  per  year 

•  File  4:  EVR  (Robotic)  crew  time  requirements  per  year 

Each of the files is listed per ORU (each line in the report is a separate
ORU) with calculations listed by year (the year is listed with the calculation
immediately to the right of the year).¹⁸ Each of the files also has a summary
portion that lists both overall (TOTAL) and average per year calculations (similar
to RMAT calculation results). However, CMAM calculates both CM and PM
actions and summarizes them in one column (unlike RMAT, which has a separate
queue (queue 1) for PM calculations). Figure 11 is an example of the CMAM
output screen:


18 The Year is defined as the operational year of the ISS, with time = 0 defined as the
decimal date of the first assembly flight (AF-01A), which occurred on 20 November 1998.

[CMAM output screen: one row per ORU, listing each year of ISS operation followed by the calculated number of failures]

Figure 11. CMAM Output screen

III. RMAT SENSITIVITY ANALYSIS


A.  RMAT  VERSUS  CMAM  OUTPUT  COMPARISON 

An attempt was made to directly compare the output of RMAT and CMAM
based upon an identical set of input ORUs (the CMAM top 120 critical ORU set)
and a similar set of input parameters. The following input parameters were
normalized for RMAT/CMAM output comparison:

•  Duration  of  failure  rate  calculations:  26  years/result  format  by  year 

•  Random  and  Wear-out  failures  calculated 

•  ORU  Beta  value  set  to  default  value  of  5 

•  Corrective  and  Preventative  Maintenance  calculated  per  ORU 

Since RMAT calculations take into consideration spare ORU availability
and crew time availability, an infinite spares list and infinite crew time had to be
assumed within RMAT (unconstrained spares and crew time run). Additionally,
the bad apple and infant mortality functions of RMAT were turned off for the
comparison run. The following table is a summary of the RMAT preprocessor
input parameters:


RMAT Version 5.9.1   DATE: 08-15-2004   TIME: 18:40:51
USER NAME: Brian T. Soldon
DATA DESCRIPTION: Top120 ORU output for CMAM comparison
<SPDM = 26.840>  <PHC = 0.663>  <AC = 32.874>

 1. LENGTH OF SIMULATION (Years) ............................ 26.000
 2. NUMBER OF RUNS (Minimum for post processor is 20) ....... 500
 3. REPORT BY (1=TIME PERIOD, 2=FLIGHT) ..................... 1
 4. IF BY TIME, TIME BETWEEN REPORTS (Months) ............... 12.000
 5. TOGGLE MANIFEST FLAG (M=MANIFEST, 0=AC) ................. M
 6. EARLY FAILURE (0=OFF 1=FISHER PRICE 2=BAD APPLE) ........ 0
 7. REPAIR FLAG (1=REG 2=INF 3=INFw/ROB 4=RES FILE) ......... 3
 8. STATION EVA ALLOCATION (POST PHC) (#EVAs) ............... 10.0
 9. TIME TO RENEWAL OF STATION EVA ALLOC (Months) ........... 1.00
10. ROBOT HRS/TIME UNIT (SPDM) .............................. 20.00
11. MB FLIGHT TO BEGIN ROBOTIC EVA SUPPORT (SSRMS) .......... 20
12. EVA OVER PACK TIME (Hours) .............................. 2.00
13. THRESHOLD FOR PERFORMING AN EVA (MAN-HRS) ............... 12.00
14. ACCOUNT FOR NONPRODUCTIVE EVA TIME (Y/N) ................ Y
15. DISPLAY OR CHANGE IVA PARAMETERS
16. SPARE FLG (0=NONE 1=INIT 2=INF G&S 3=INF GRND) .......... 2
17. PRIORITY TO TRIGGER UNSCHEDULED EVA ..................... 4
18. TOGGLE SCREEN OUTPUT FLAG ............................... N
19. TOGGLE BEEPING FLAG AT THE END OF SIMULATION ............ N
20. RANDOM NUMBER SEED ...................................... 10

Table 1. RMAT Input parameters

Upon execution of RMAT failure rate calculations it was determined that,
over a 26 year period, a total of 3069.83 corrective and preventative maintenance
actions were forecasted, with an average of 118.07 actions per year. The RMAT
output results are summarized below:


RMAT Version 5.9.1   DATE: 07-15-2004   TIME: 18:40:51
USER NAME: Brian T. Soldon
DATA DESCRIPTION: Top120 ORU output for CMAM comparison

<<<< MAINTENANCE ACTION SECTION >>>>
MAINTENANCE PERFORMED (FLIGHT: Top120)

TIME      EVA only   EVR only   Co-op    Tot EVA   Tot EVR   Tot IVA    Total Maint Actions
1.000     0.00       0.00       0.00     0.00      0.00      2.01       2.01
2.000     0.00       0.00       0.00     0.00      0.00      4.02       4.02
3.000     0.02       0.00       0.09     0.11      0.09      74.14      74.25
4.000     0.02       0.00       0.05     0.07      0.05      105.04     105.11
5.000     0.20       0.00       0.13     0.33      0.13      104.39     104.72
6.000     0.95       0.00       1.42     2.37      1.42      112.61     114.98
7.000     0.27       0.00       0.45     0.72      0.45      120.94     121.66
8.000     0.61       0.00       1.92     2.53      1.92      127.43     129.96
9.000     0.47       0.00       0.69     1.16      0.69      120.74     121.9
10.000    1.45       0.00       1.80     3.25      1.80      131.84     135.09
11.000    1.40       0.00       2.27     3.66      2.27      129.61     133.27
12.000    0.23       0.00       1.22     1.45      1.22      121.14     122.59
13.000    0.90       0.00       1.31     2.20      1.31      133.99     136.19
14.000    0.65       0.00       0.79     1.44      0.79      131.18     132.62
15.000    0.78       0.00       0.82     1.60      0.82      136.37     137.97
16.000    0.87       0.00       1.06     1.94      1.06      128.33     130.27
17.000    0.81       0.00       1.66     2.48      1.66      133.08     135.56
18.000    0.89       0.00       1.51     2.39      1.51      131.67     134.06
19.000    0.90       0.00       1.37     2.27      1.37      133.3      135.57
20.000    0.83       0.00       1.33     2.16      1.33      131.22     133.38
21.000    0.81       0.00       1.27     2.09      1.27      132.83     134.92
22.000    0.84       0.00       1.33     2.17      1.33      130.82     132.99
23.000    0.78       0.00       1.15     1.93      1.15      132.54     134.47
24.000    0.97       0.00       1.51     2.49      1.51      150.11     152.6
25.000    0.81       0.00       1.16     1.97      1.16      132.68     134.65
26.000    0.89       0.00       1.30     2.19      1.30      132.78     134.97
TOTAL     17.3       0.00       27.6     45.0      27.6      3024.83    3069.83
AVERAGE   0.67       0.00       1.06     1.73      1.06      116.34     118.07

Table 2. RMAT Maintenance Action output results

Upon execution of CMAM failure rate calculations, it was found that RMAT
was forecasting slightly higher failure rates than CMAM, both for overall failures
and for average failures per year. CMAM estimated 92.5% of the total CM
and PM failures estimated by RMAT.¹⁹ However, it was also determined that
CMAM output results are highly sensitive to changes in preventative
maintenance remove-and-replace (PMRR) scheduling, especially when
calculating failures on short MTBF (MTBMAtotal)²⁰ ORUs over long periods of
time (i.e., duration of forecast calculations > 20 years). This modeling sensitivity
was demonstrated by changing the preventative maintenance schedule of just
one ORU within the CMAM database. A CMAM run was executed both with and
without preventative maintenance on an external component, the Control Moment
Gyro (CMG). The run without preventative maintenance forecasted nearly 50% more
total maintenance actions over the 26 year period (all attributed to increases
in corrective maintenance requirements on the CMG). The following tables show
the results of the CMAM run with and without CMG preventative maintenance:


19 RMAT forecasted failures = 3069.83, CMAM forecasted failures = 2840.14 (CMAM
failures / RMAT failures = 92.5%)

20 MTBMAtotal is the adjustment to MTBF values based upon the WP-4 Rutherford Equation
discussed in Section 2.4.a.

CMAM Version 3   DATE: 08-15-2004
USER NAME: Brian T. Soldon
DATA DESCRIPTION: Top120 ORU output

          With Preventative Maintenance on CMG (external)    Without Preventative Maintenance on CMG (external)
Time      EVA        IVA         Total Maint Actions         EVA         IVA         Total Maint Actions
1         0.208      9.0323      9.2403                      0.208       9.0323      9.2403
2         0.208      9.3975      9.6055                      0.208       9.3975      9.6055
3         1.0838     67.9256     69.0094                     0.4593      67.9256     68.3849
4         1.8385     100.7378    102.5763                    1.0471      100.7378    101.7849
5         1.8385     100.7474    102.5859                    1.0896      100.7474    101.837
6         1.8385     100.7698    102.6083                    1.2129      100.7698    101.9827
7         2.0077     100.6449    102.6526                    1.6537      100.6449    102.2986
8         2.0077     106.4453    108.453                     2.161       106.4453    108.6063
9         2.0077     107.6574    109.6651                    3.0124      107.6574    110.6698
10        2.0077     107.8614    109.8691                    4.3365      107.8614    112.1979
11        2.0077     108.1665    110.1742                    6.2824      108.1665    114.4489
12        2.0077     108.6041    110.6118                    9.0197      108.6041    117.6238
13        2.0077     109.2142    111.2219                    12.7387     109.2142    121.9529
14        2.0077     110.0387    112.0464                    17.6499     110.0387    127.6886
15        2.0077     111.1268    113.1345                    23.9844     111.1268    135.1112
16        2.0077     112.5334    114.5411                    31.9938     112.5334    144.5272
17        2.0077     114.3163    116.324                     41.9501     114.3163    156.2664
18        2.0077     116.5421    118.5498                    54.146      116.5421    170.6881
19        2.0077     119.2811    121.2888                    68.8944     119.2811    188.1755
20        2.0077     122.6077    124.6154                    86.5288     122.6077    209.1365
21        2.0077     126.6026    128.6103                    107.4033    126.6026    234.0059
22        2.0077     131.354     133.3617                    131.8924    131.354     263.2464
23        2.0077     136.9525    138.9602                    160.391     136.9525    297.3435
24        2.0077     143.4966    145.5043                    193.3146    143.4966    336.8112
25        2.0077     151.0862    153.0939                    231.0992    151.0862    382.1854
26        2.0077     159.8309    161.8386                    274.2011    159.8309    434.032
TOTAL     47.1693    2792.973    2840.1424                   1466.878    2792.973    4259.8506
AVE/yr    1.814204   107.422     109.2362461                 56.4184     107.422     163.840403

Table 3. CMAM Maintenance Action Output results

CMG With and Without Preventative Maintenance

YEAR      Maint Actions Required With PM    Maint Actions Required no PM
1         0                                 0
2         0                                 0
3         0.6248
4         0.8
5         0.8                               0.0511
6         0.8                               0.1744
7         0.8                               0.446
8         0.8                               0.9533
9         0.8                               1.8047
10        0.8                               3.1288
11        0.8                               5.0747
12        0.8                               7.812
13        0.8                               11.531
14        0.8                               16.4422
15        0.8                               22.7767
16        0.8                               30.7861
17        0.8                               40.7424
18        0.8                               52.9383
19        0.8                               67.6867
20        0.8                               85.3211
21        0.8                               106.1956
22        0.8                               130.6847
23        0.8                               159.1833
24        0.8                               192.1069
25        0.8                               229.8915
26        0.8                               272.9934
TOTAL     19.0248                           1438.7338
AVE/YR    0.731723077                       55.33591538

Table 4. CMAM CMG Maintenance Action forecasts w/ and w/o PM
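The headline comparisons drawn from Tables 2 and 3 can be reproduced directly from the table totals. The short Python sketch below uses only the TOTAL rows quoted in those tables; the constant and function names are illustrative.

```python
RMAT_TOTAL = 3069.83                # Table 2, Total Maint Actions
CMAM_TOTAL_WITH_PM = 2840.1424      # Table 3, CMG with PM
CMAM_TOTAL_WITHOUT_PM = 4259.8506   # Table 3, CMG without PM
CMAM_EVA_WITH_PM = 47.1693          # Table 3, EVA total with PM
CMAM_EVA_WITHOUT_PM = 1466.878      # Table 3, EVA total without PM


def cmam_share_of_rmat() -> float:
    """CMAM total (with PM) as a fraction of the RMAT total."""
    return CMAM_TOTAL_WITH_PM / RMAT_TOTAL


def increase_without_pm() -> float:
    """Relative increase in CMAM total actions when CMG PM is removed."""
    return CMAM_TOTAL_WITHOUT_PM / CMAM_TOTAL_WITH_PM - 1.0


def eva_share_of_increase() -> float:
    """Fraction of the added actions that appear in the EVA column."""
    added_total = CMAM_TOTAL_WITHOUT_PM - CMAM_TOTAL_WITH_PM
    added_eva = CMAM_EVA_WITHOUT_PM - CMAM_EVA_WITH_PM
    return added_eva / added_total


if __name__ == "__main__":
    print(f"CMAM / RMAT: {cmam_share_of_rmat():.1%}")
    print(f"increase without CMG PM: {increase_without_pm():.1%}")
    print(f"EVA share of the increase: {eva_share_of_increase():.1%}")
```

The IVA column is identical in both runs, so the entire increase sits in the EVA column, which is consistent with the change being confined to an external ORU.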

The sensitivity of CMAM to changes in preventative maintenance
scheduling, especially on short MTBF (MTBMAtotal) ORUs when calculating
failures over long periods of time, can be attributed to characteristics of the
Weibull distribution when calculating wear-out failures. As discussed earlier,
when the Weibull shape factor (β) is greater than four (β = 5 in our case) it results
in exponentially increasing failure rates as components age (approach end-of-life
or wear-out). The goal is to schedule preventative maintenance on these
components prior to corrective maintenance requirements becoming
unacceptably high. However, if no preventative maintenance is scheduled,
corrective maintenance forecasts on these components will continue to increase
exponentially, and will result in high failure rate predictions (unrealistically high in
most cases). Thus, it can be said that if a β value of five is used in wear-out
failure calculation, it is imperative to accurately schedule preventative
maintenance on short MTBF (MTBMAtotal) ORUs.
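The mechanism described above can be illustrated with a deliberately simplified model. The sketch below is not the CMAM or RMAT algorithm. It assumes that corrective maintenance restores function without resetting the age of the unit, so that the expected number of failures over a horizon equals the cumulative Weibull hazard, and that each preventative replacement resets the age to zero. All names and numbers are hypothetical.

```python
def expected_failures_no_pm(horizon: float, beta: float, eta: float) -> float:
    """Cumulative Weibull hazard (horizon / eta) ** beta for an unreplaced unit."""
    return (horizon / eta) ** beta


def expected_failures_with_pm(horizon: float, beta: float, eta: float,
                              pm_interval: float) -> float:
    """Cumulative hazard when the unit's age is reset every pm_interval."""
    full_cycles = int(horizon // pm_interval)
    remainder = horizon - full_cycles * pm_interval
    per_cycle = (pm_interval / eta) ** beta
    return full_cycles * per_cycle + (remainder / eta) ** beta


if __name__ == "__main__":
    horizon, eta, pm_interval = 26.0, 7.0, 5.0  # years, hypothetical values
    for beta in (1.0, 5.0):
        without_pm = expected_failures_no_pm(horizon, beta, eta)
        with_pm = expected_failures_with_pm(horizon, beta, eta, pm_interval)
        print(f"beta={beta}: no PM={without_pm:.2f}, with PM={with_pm:.2f}")
```

Under these assumptions a shape factor of 1 makes preventative replacement irrelevant to the expected failure count, while a shape factor of 5 makes the count without replacement grow with the fifth power of the horizon, which is why the forecast becomes so sensitive to whether a PM interval is entered.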

The current ORU reliability forecasting issue, as stated in LSAR revisions
M, N, and O, is that while cumulative IVA actual versus forecasted PM and CM
maintenance actions and crew times remain relatively accurate (within 23% for
PM, and 8% for CM), cumulative projected EVA crew times “grossly” exceed
actuals and the EVA numbers continue to diverge.²¹ It seems possible that
RMAT may have the same sensitivity to preventative maintenance scheduling as
CMAM. In order to test this theory, a comparative run was executed in RMAT
for CMG failure rates both with and without preventative maintenance over the
same period of time (26 years) and with the same Weibull shape parameter
(β = 5). It was found that, although corrective maintenance actions increased when
no PM was scheduled, the CM actions did not increase exponentially after the
CMG wear-out period.²² The following table summarizes the results of the
RMAT run on the CMG with and without preventative maintenance:


21 LSAR (D684-10162-1-1), Revision O, details this divergence issue and discusses possible
causes on pages 4-1 and 4-2.

22 The CMG wear-out period is defined as MTBMAtotal ≈ 6.5 years.

CMG With and Without Preventative Maintenance - RMAT results

YEAR      Maint Actions Required With PM    Maint Actions Required no PM
1         0                                 0
2         0                                 0
3         0.1                               0.7
4         0.12                              0.94
5         0.1                               0.98
6         0.14                              0.93
7         0.23                              1.07
8         1.07                              1.05
9         1.36                              1.09
10        1.41                              1.18
11        1.16                              1.13
12        0.92                              1.1
13        0.76                              1.02
14        0.58                              1.15
15        0.58                              1.1
16        0.69                              1.09
17        0.9                               1.05
18        1.05                              1.08
19        1.14                              1.09
20        0.98                              1.1
21        0.84                              1.09
22        0.8                               1.04
23        0.7                               1.11
24        0.71                              1.09
25        0.92                              1.04
26        0.93                              1.08
TOTAL     18.19                             25.3
AVE/YR    0.699615385                       0.973076923

Figure 12. RMAT CMG Maintenance Action forecasts w/ and w/o PM

When an RMAT run was executed on all of the 120 representative ORUs
(the CMAM database) with and without PM schedules, it was found that RMAT
corrective maintenance approximately doubles over a 26 year period, while
CMAM forecasted CM actions tend to increase exponentially for the same set of
ORUs. Thus it can be said that, while RMAT is not nearly as sensitive as CMAM to a
lack of preventative maintenance on short MTBF ORUs, removing PM does tend to
increase its maintenance action requirements, and this may be a contributing factor
to the overall trend of forecast versus actual failure divergence, especially in
reference to external ORUs.

An analysis of the RMAT ORU database (MADS) shows that, of the 1379
distinct ORUs within the database, 729 are Interior (IVA) ORUs and 650 are
Exterior (EVA) ORUs. Of the 729 IVA ORUs, 32 have an associated Mean Time
Between Preventative Maintenance Remove and Replace (MTBPMRR), while of
the 650 EVA ORUs, only 9 have an associated MTBPMRR. Even more
significant is that, although exterior components on average have longer lives
(longer MTBF/MTBMAtotals), there are still 71 EVA ORUs that have an MTBF
of less than 100,000 hours, and only 2 of these ORUs have an associated
MTBPMRR (while of the 116 IVA ORUs with an MTBF of less than 100,000 hours,
18 have preventative maintenance schedules). This fact by itself may be enough
to explain why forecasts diverge for external ORUs while internal
forecasts remain fairly accurate.
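The imbalance is easier to see when the counts quoted above are turned into proportions. The sketch below uses only the numbers from the preceding paragraph; the labels and the helper name `coverage` are introduced here for illustration.

```python
# (ORUs with an MTBPMRR, ORUs in the group), as quoted in the paragraph above.
counts = {
    "IVA, all ORUs": (32, 729),
    "EVA, all ORUs": (9, 650),
    "IVA, MTBF < 100,000 h": (18, 116),
    "EVA, MTBF < 100,000 h": (2, 71),
}


def coverage(with_pm, total):
    """Fraction of a group of ORUs that has a preventative maintenance schedule."""
    return with_pm / total


for label, (with_pm, total) in counts.items():
    print(f"{label}: {with_pm}/{total} = {coverage(with_pm, total):.1%}")
```

Among short-MTBF items, about 15.5% of the IVA ORUs carry a PM schedule, against about 2.8% of the EVA ORUs.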

Lastly,  it  must  be  stated  that  the  simple  addition  of  EVA  preventative 
maintenance  on  short  MTBF  items  does  not  seem  to  be  the  appropriate  solution 
to  the  problem  of  exaggerated  forecasted  EVA  failure  rates  for  two  reasons: 

•  Historical/actual EVA maintenance actions (CM) do not seem to
merit the addition of such maintenance (LSAR Revision O shows
only 5 EVA CM actions to date)

•  Additional preventative maintenance on EVA components is
avoided (if possible) because it is inherently dangerous and time
consuming

Thus, it seems more likely that the use of a lower β value for determining
wear-out failures should be explored, especially for short-MTBF external
components. It seems highly likely that β values closer to 1 (constant rate
failures) would be more accurate, based upon the historical/actual failure
rates that are collected as the ISS matures.
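The role of the shape parameter can be illustrated with the standard two-parameter Weibull failure rate, h(t) = (β/η)(t/η)^(β-1), where η is the characteristic life. The sketch below is not taken from RMAT or CMAM; the characteristic life of 100,000 hours and the β values are hypothetical, chosen only to show how the failure rate behaves as β moves toward 1.

```python
def weibull_hazard(t, beta, eta):
    """Weibull failure rate h(t) = (beta / eta) * (t / eta) ** (beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)


ETA_HOURS = 100_000.0  # hypothetical characteristic life, not a MADS value

for beta in (1.0, 2.0, 3.0):  # illustrative shape values only
    early = weibull_hazard(0.5 * ETA_HOURS, beta, ETA_HOURS)
    late = weibull_hazard(1.5 * ETA_HOURS, beta, ETA_HOURS)
    print(f"beta = {beta}: rate at 0.5*eta = {early:.2e}/h, at 1.5*eta = {late:.2e}/h")
```

With β = 1 the rate is the same at every age (a constant failure rate of 1/η); with β > 1 the rate climbs with age, and the climb is steeper the larger β is. That is why a β that is too high would inflate forecasts for items that have passed their wear-out period.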

B.  CMAM  UNCERTAINTY  USING  CRYSTAL  BALL 

So far our discussion and analysis of failure rates for ORUs has centered
on the distribution of time between failures (failure rate modes: constant rate
and wear-out), as opposed to the accuracy of stated MTBFs and associated
MTBMAtotals. Due to the unique nature of the ISS, and the unique components
of its associated systems, reliability analysis of the ISS as a system and of its
ORUs individually is based upon predicted MTBF and K-factor23 data. This ensures
that there will be added levels of uncertainty in ORU failure rate forecasts. The
idea is to quantify this uncertainty to the highest degree possible to aid in logistics
and maintenance action planning.

1.  Purpose  of  Crystal  Ball  Simulation  Package 

The two major sources of uncertainty for failure rate calculations relate to
inaccuracies of:

•  Mean Time Between Failures (MTBF): The average time between failures
of a specific ORU based upon characteristics of the ORU itself

•  K-Factor: A multiplier that accounts for increased equipment
maintenance actions not included in the inherent MTBF estimates.
These maintenance actions include human-induced, environment-induced,
false, and other-equipment-induced maintenance.

For the purpose of CMAM uncertainty analysis, these two input
parameters were treated as variables in developing failure rate estimates. This
was accomplished through the use of the Crystal Ball® 2000 program.

Crystal Ball® 2000 is a simulation program that assists in analyzing the
risks and uncertainties associated with forecasting models. Crystal Ball was
chosen for this uncertainty analysis for the following reasons:

•  It  allows  the  incorporation  of  all  assumptions  made  for  CMAM 
failure  rate  calculation  purposes 

•  It allows for as many replications as needed to reduce the effect of
random variation

•  It provides a confidence level for data sensitivity analysis.

•  It provides a means of analyzing data by utilizing dissimilar
distributions exclusive of the probability distribution functions.24

2.  Assumptions 

In  order  to  simplify  this  uncertainty  analysis  a  number  of  assumptions 
needed  to  be  made: 


23  MTBF  and  K-factor  (see  Section  III.C.1  for  definition)  are  considered  to  be  the  two  primary 
causes  of  uncertainty  in  failure  rate  calculation.  These  factors  are  taken  into  account  within 
RMAT  (RMAT  uses  Monte-Carlo  simulation  with  600  iterations  to  account  for  these  uncertainty 
factors). 

24 http://www.crystalball.com/crystal_ball/index.html, May 15, 2003

•  Only constant failure rate Corrective Maintenance action
requirements were considered for all 120 ORUs within the CMAM
database.

•  All 120 ORUs are assumed to be operational on the ISS (steady
state calculation).

•  Both the MTBF and K-factor for each ORU are normally
distributed about the stated value, each with a standard
deviation of 10% of that value.

•  600 iterations were performed for each ORU (in order to reduce
the effect of random variation).

•  All ORUs are considered independent of one another, and equally
mission critical.
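The assumptions above can be sketched as a small Monte Carlo experiment. This is not the Crystal Ball model itself: Python's standard `random` module stands in for Crystal Ball, the three ORUs are hypothetical, and the per-ORU formula (quantity × 8760 hours × K-factor / MTBF, with a duty cycle of 1) is an assumed simplification of a constant failure rate calculation rather than the exact CMAM equation.

```python
import random
import statistics

HOURS_PER_YEAR = 8760

# Hypothetical ORUs: (MTBF in hours, K-factor, quantity installed).
ORUS = [(273_400.0, 1.1, 1), (50_000.0, 1.5, 2), (120_000.0, 1.2, 4)]


def annual_cm_actions(orus):
    """Expected CM actions per year under the assumed constant-rate formula."""
    return sum(qty * HOURS_PER_YEAR * k / mtbf for mtbf, k, qty in orus)


def simulate(orus, trials=600, rel_sd=0.10, seed=1):
    """Resample MTBF and K-factor from normal distributions for each trial."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        sampled = [
            (rng.gauss(mtbf, rel_sd * mtbf), rng.gauss(k, rel_sd * k), qty)
            for mtbf, k, qty in orus
        ]
        totals.append(annual_cm_actions(sampled))
    return totals


baseline = annual_cm_actions(ORUS)
totals = simulate(ORUS)
print(f"fixed inputs: {baseline:.3f} actions/yr")
print(f"simulated mean: {statistics.mean(totals):.3f}, "
      f"range: {min(totals):.3f} to {max(totals):.3f}")
```

Because MTBF sits in the denominator, sampling it symmetrically about its stated value tends to push the simulated mean slightly above the fixed-input result. With only three ORUs the relative spread is also much wider than it would be for a sum over 120 independent items.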

3.  Crystal  Ball  Results 

Based upon the constant failure rate calculation within CMAM, a total of 6.38
CM actions per year can be expected on these 120 ORUs. The
following figure shows the CMAM output screen for steady state calculations:


[Figure 13: screenshot of the CMAM "ISS Comparative Maintenance Analysis Model"
steady state analysis form, showing the database fields for record 1 of 120 and
the following steady state analysis results:

Total CM actions per year:     6.38053
EVA crew time per year:        5.23510
IVA crew time per year:        38.49435
Robotic crew time per year:    18.06747]

Figure 13. CMAM Steady State Output Results

When MTBF and K-factors were assumed to vary (normally
distributed with a standard deviation of 10% of the stated value) the following
results were attained:


[Figure 14: Crystal Ball frequency chart and cumulative chart for the forecast
"Total CM Actions Required Top 120" (600 trials; horizontal axis: Actions / yr),
with the following statistics:

Statistic        Value
Trials           600
Mean             6.42
Range Minimum    6.05
Range Maximum    7.07
Range Width      1.02]

Figure 14. Crystal Ball® Output results

Thus, based upon these simulation input parameters, the steady
state CM failure rate forecasts can be expected (with 100% certainty, i.e.,
across all 600 trials) to fluctuate by no more than 15.88% of the mean value
(range width of 1.02 with a mean of 6.42). With these figures it can be said
that errors in MTBF and K-factors alone most likely cannot explain the
divergence between forecasted and actual EVA maintenance requirements; a
more likely explanation is a combination of MTBF and K-factor uncertainty and
inappropriate β values when modeling wear-out failures.

IV.  RECOMMENDATIONS  AND  CONCLUSIONS


A.  RECOMMENDATION  FOR  CMAM  USE 

The Comparative Maintenance Analysis Tool (CMAM) is a user-friendly
alternative to RMAT for executing basic failure rate analysis on Orbital
Replacement Units for the ISS. It can provide immediate feedback for logistics
planners on estimates of both corrective and preventative maintenance
requirements for both internal and external ORUs. CMAM is not meant as a
replacement for the robust capabilities of RMAT, nor has the program been
independently validated to ensure that its results are accurate.
However, once validated it will provide a readily available tool for RMAT
comparison that clarifies and simplifies the algorithms through
which failure rate calculations are made. Therefore, it is recommended that
NASA L&M and Boeing L&M consider validating CMAM for use as a reference
tool when full RMAT forecasts are not needed, or when only basic failure
rate data is required for planning purposes.

Following this recommendation, the CMAM ORU database should be
completed and kept up to date in order to keep CMAM output results as accurate
and complete as possible, and to allow for a more meaningful comparison with RMAT
results. Completion of this database is estimated to take between 35 and 45
hours of work executed from the CMAM database input mask within the CMAM
program. Completing and using an Access-based database, as opposed to
the current Excel-based MADS listing, will reduce the overall number of input
errors in the ORU database and will most likely reduce the amount of time
required both to maintain the database and to format CMAM and RMAT input
files (through the cut and paste of Access SQL query results).

B.  RECOMMENDATION  FOR  FOLLOW-ON  RESEARCH

Based upon the results of the direct comparison between RMAT and
CMAM, it is recommended that the effects of reduced β values for
Weibull wear-out failure rates, as well as the effects of increased EVA
MTBPMRR, be explored in RMAT to determine the overall effects on failure rate
forecasts, especially for short-MTBF external ORUs.

Lastly,  it  is  recommended  that  further  research  be  conducted  to  find  out  if 
a  program  such  as  CMAM  can  be  applied  to  such  areas  as  forecasting  failures  of 
submarine  components  to  optimize  sparing  and/or  maintenance  scheduling.  This 
research  could  take  significant  time  in  terms  of  populating  new  spare  part 
databases  with  the  appropriate  reliability  data  but  could  provide  a  better 
forecasting  tool  than  what  is  currently  in  use. 

C.  CONCLUSION 

The  International  Space  Station  has  a  unique  Logistics  and  Maintenance 
system  that  requires  the  efficient  and  effective  forecasting  of  part  failures  and 
associated  resource  requirements.  Due  to  the  complexity  of  the  ISS  as  a 
system, and the environment in which it and its crew operate, forecasting these
failures  is  often  as  much  an  art  as  it  is  a  science.  Although  the  primary  tool  for 
executing  ORU  failure  rate  forecasts  (RMAT)  is  a  powerful  analytical  and 
simulation  based  program,  it,  just  like  any  other  probability  forecasting  tool,  has 
its  own  set  of  inefficiencies,  inaccuracies  and  weaknesses.  It  is  of  primary 
importance  to  identify  these  weaknesses  and  their  causes  as  quickly  as  possible. 
The  growing  divergence  issue  between  external  ORU  forecasted  failures  and 
actual  failures  is  an  issue  that  deserves  attention  and  correction.  This  thesis  is 
an  attempt  to  analyze  this  issue  and  its  underlying  causes.  It  is  believed  that, 
after  studying  the  underlying  failure  rate  calculation  algorithms  of  RMAT  and 
developing  an  independent  program  that  replicates  some  of  these  calculations 
(CMAM), the underlying problem is a combination of multiple factors. The
primary factors are:

•  RMAT and CMAM are utilizing Weibull shape parameter (β) values
that are too high in relation to wear-out failure forecasting

•  Inherent uncertainty in the accuracy of ORU MTBF and K-factor
values, which tends to lead to inaccurate failure rate forecasts,
especially when looking at a relatively small set of ORUs (120 for
this analysis) over a relatively short period of time (approximately 6
years since first assembly flight).

APPENDIX  A.  CMAM  DATABASE


A.  INTRODUCTION 

The CMAM ORU database serves as the source of all ORU reliability data
used for ORU failure rate calculation. It duplicates the majority of the
information within the NASA R&M Modeling Analysis Data Set (MADS).
However, the CMAM ORU database is not as comprehensive as MADS: for
each ORU, MADS contains 59 separate fields, whereas the CMAM ORU
database has only 26 fields per ORU. On the other hand, the CMAM database is a
relational database with 3 separate but dynamically linked tables that can be
updated from the CMAM user interface.

B.  ASSUMPTIONS  AND  REQUIREMENTS

Stakeholders

The  stakeholders  in  the  database  are  the  primary  NASA  L&M  planners, 
NASA  R&M  personnel,  and  Boeing  L&M  personnel. 

Query  Requirements 

The primary query requirements focus on breaking down ORU failure rate
forecasts by ISS assembly flight and by ISS operational year. However, NASA
L&M staff often have analysis requirements that call for drill-down
capability to the individual ORU level. Therefore the following ORU query
types have been preprogrammed into CMAM:

•  Search  by  Assembly  Flight 

•  Search  by  ISS  Operational  Year 

•  Search  by  ISS  system 

•  Search  by  ORU  Name 

•  Search  by  Internal/External  Component 

Further database querying is accomplished by writing SQL queries in
Microsoft Access® and then entering the results into the CMAM program.
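The preprogrammed query types listed above amount to simple filters and one join. The sketch below shows the idea using Python's built-in `sqlite3` module as a stand-in for Microsoft Access. The table names Flight and ORUlist and the attributes Flight_Num and Flight_Date appear in this appendix; every other column name, every row, and the reading of "operational year" as the year of the flight's decimal date are assumptions made for illustration.

```python
import sqlite3

# Hypothetical miniature of the Flight and ORUlist tables (invented rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Flight (Flight_Num TEXT PRIMARY KEY, Flight_Date REAL);
    CREATE TABLE ORUlist (
        Part_Name  TEXT,
        System     TEXT,
        EICM       INTEGER,  -- 0 = IVA, 1 = EVA
        Flight_Num TEXT REFERENCES Flight (Flight_Num)
    );
    INSERT INTO Flight VALUES ('F1', 2001.10), ('F2', 2002.45);
    INSERT INTO ORUlist VALUES
        ('Pump A', 'ITCS', 0, 'F1'),
        ('Antenna B', 'C&T', 1, 'F1'),
        ('Card C', 'C&T', 0, 'F2');
""")

SEARCHABLE = {"Flight_Num", "System", "Part_Name", "EICM"}


def search(column, value):
    """Return the part names whose `column` equals `value`."""
    if column not in SEARCHABLE:  # column names cannot be bound as parameters
        raise ValueError(f"cannot search by {column!r}")
    sql = f"SELECT Part_Name FROM ORUlist WHERE {column} = ? ORDER BY Part_Name"
    return [row[0] for row in conn.execute(sql, (value,))]


def search_by_year(year):
    """Return parts delivered on flights whose decimal date falls within `year`."""
    sql = """SELECT o.Part_Name FROM ORUlist AS o
             JOIN Flight AS f ON f.Flight_Num = o.Flight_Num
             WHERE f.Flight_Date >= ? AND f.Flight_Date < ?
             ORDER BY o.Part_Name"""
    return [row[0] for row in conn.execute(sql, (year, year + 1))]


print(search("System", "C&T"))   # search by ISS system
print(search("EICM", 1))         # search by internal/external component
print(search_by_year(2002))      # search by year, via the Flight table
```

The search by year is the only query that needs the relationship between the two tables, which is one practical benefit of a relational layout over a flat spreadsheet.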

C.  RELATIONS,  RELATIONSHIPS,  AND  CONSTRAINTS 

The CMAM ORU database design was executed through a series of
iterative improvements to increase functionality and updatability through the
CMAM user interface. To reduce implementation problems, three database design
tools were developed:

•  Entity  Relationship  Diagram  (ERD) 

•  Table/Column  Diagram 

•  Microsoft  Access  Relationship  diagram 

Entity  Relationship  Diagram  (ERD) 

The ERD is a graphical schematic used to represent database entities and
their relationships. Entities are shown in rectangles while relationships are
shown in diamonds. Cardinalities between entities are shown within the diamonds.
Each entity has a number of attributes that describe it (e.g., the entity Flight has
the attributes Flight_Num and Flight_Date). Lastly, relationships bridge the
gap between entities. Each relationship has a minimum and a maximum
cardinality, which, in a binary relationship, identify the number of elements
allowed on each side of the relationship. CMAM has three such relationships
that enhance the level of granularity of a user's database search. See Figure 15.

Table/Column  Diagram 

The table/column diagram was then constructed to ensure that the
tables and columns corresponding to the ERD were ready for entry into the
Microsoft Access® database design. Primary keys for each table were
identified, and the functional dependency of each non-key attribute on its
primary key was verified (i.e., the tables were normalized). See Figure 16.

Microsoft Access Relationships

Lastly, the table/column diagram was translated into the Access® design
view and the relationships were linked. One of the key aspects of the CMAM
ORU database is that each table and each attribute has specific input
requirements (e.g., a field that requires a number will not accept a letter), and
referential integrity exists between the tables (e.g., you cannot add an ORU on a
flight that doesn't exist). These qualities ensure both data accuracy and integrity
at a much higher level than spreadsheet databases (MADS is an Excel
spreadsheet-based database). See Figure 17.
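The two safeguards described above, typed fields and referential integrity, can be demonstrated with a small sketch. It uses Python's built-in `sqlite3` module rather than Access, so the mechanics differ (SQLite needs foreign key enforcement switched on and an explicit CHECK for the numeric field). The column set, the sample rows, and the helper names `make_db` and `try_insert` are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for the Flight and ORUlist tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript("""
        CREATE TABLE Flight (Flight_Num TEXT PRIMARY KEY, Flight_Date REAL);
        CREATE TABLE ORUlist (
            Part_Name  TEXT PRIMARY KEY,
            MTBF       REAL NOT NULL CHECK (typeof(MTBF) IN ('integer', 'real')),
            Flight_Num TEXT NOT NULL REFERENCES Flight (Flight_Num)
        );
    """)
    return conn


def try_insert(conn, table, row):
    """Insert a row; return False if a database constraint rejects it."""
    placeholders = ", ".join("?" for _ in row)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
ok_flight = try_insert(conn, "Flight", ("F1", 2001.10))
ok_oru = try_insert(conn, "ORUlist", ("Card C", 273400.0, "F1"))
bad_flight = try_insert(conn, "ORUlist", ("Card D", 1000.0, "F9"))  # no such flight
bad_number = try_insert(conn, "ORUlist", ("Card E", "abc", "F1"))   # letters in MTBF
print(ok_flight, ok_oru, bad_flight, bad_number)
```

A spreadsheet would accept both of the rejected rows silently, which is the kind of input error the relational design is meant to prevent.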

Figure 15. CMAM ORU Database ERD

Figure 16. CMAM ORU Database Table/Column Diagram

Figure 17. CMAM ORU Microsoft Access Relationship

D.  USER  INTERFACE

The CMAM ORU database is accessed through two separate forms that
allow for the dynamic update of both the ORUlist table and the Flight table. The
Flight table input mask offers the added functionality of allowing standard dates to
be entered (MM/DD/YY) and automatically converting them to decimal dates.
(See Figures 18 and 19.)
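One plausible form of the date conversion is sketched below. It assumes that a "decimal date" means the calendar year plus the elapsed fraction of that year; the exact convention used by the CMAM input mask is not stated here, and the function name `to_decimal_date` is hypothetical.

```python
from datetime import datetime


def to_decimal_date(text):
    """Convert an MM/DD/YY string to year plus elapsed fraction of that year."""
    day = datetime.strptime(text, "%m/%d/%y")
    year_start = datetime(day.year, 1, 1)
    next_year_start = datetime(day.year + 1, 1, 1)
    # Dividing two timedeltas gives the elapsed fraction of the year as a float.
    return day.year + (day - year_start) / (next_year_start - year_start)


for sample in ("01/01/03", "07/02/03", "12/04/98"):
    print(sample, "->", round(to_decimal_date(sample), 3))
```

Python's `%y` directive maps two-digit years 69 through 99 to the 1900s and 00 through 68 to the 2000s, so a date such as 12/04/98 is read as 1998.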

Figure 18. CMAM ORULIST Table User Interface

Figure 19. CMAM FLIGHT Table User Interface

APPENDIX  B.  LIST  OF  ABBREVIATIONS  AND  ACRONYMS

AC          Assembly Complete
ACDC        Assembly Complete Duty Cycle
CMAM        Comparative Maintenance Assessment Tool
CMG         Control Moment Gyro
CM          Corrective Maintenance
C1          Criticality Code 1
C&T         Command & Telemetry
CSRR        Crew Size Removal and Replace
DC          Duty Cycle
DECR        Decrement
ECSCM       External Crew Size Corrective Maintenance
EICM        External/Internal Corrective Maintenance
EMST        External Maintenance Solution Team
ERD         Entity Relationship Diagram
EVA         Extra-Vehicular Activity
Flt_Qty     Flight Quantity
Flight_No   Flight Number
GAO         General Accounting Office
GUI         Graphic User Interface
GRND        Ground
ICSCM       Internal Crew Size Corrective Maintenance
ISS         International Space Station
ITCS        Internal Thermal Control System
IVA         Intra-Vehicular Activity
IVR         Intra-Vehicular Robotics
JSC         Johnson Space Center
KSC         Kennedy Space Center
LIFLIM      Life Limit
Lchar       Life Characteristic
L&M         Logistics and Maintenance
LSAR        Logistics Supportability Assessment Report
MADS        Modeling Analysis Data Set
MTBF        Mean Time Between Failures
MTBMAtotal  Mean Time Between Maintenance Actions total
MTBPMRR     Mean Time Between Preventative Maintenance Remove and Replace
MTTF        Mean Time To Fail
MTTR        Mean Time To Repair
NASA        National Aeronautics and Space Administration
OEM         Original Equipment Manufacturer
OP          Operating ratio
ORU         Orbital Replacement Unit
PM          Preventative Maintenance
PMRR        Preventative Maintenance Remove and Replace
RMAT        Reliability and Maintainability Assessment Tool
R&M         Reliability and Maintenance
ROBMTTR     Robotic Mean Time To Repair
SQL         Structured Query Language
START       Station Availability Reporting Tool
THC         Temperature and Humidity Control
USA         United Space Alliance
VB.net      Visual Basic.net